RAW SEQUENCE LISTING DATE: 05/02/2001

PATENT APPLICATION: US/09/836,613 TIME: 12:01:56

Input Set : A:\es.txt

Output Set: N:\CRF3\05022001\I836613.raw

SEQUENCE LISTING 

3 (1) GENERAL INFORMATION: 

5 (i) APPLICANT: HOPWOOD, JOHN JOSEPH; SCOTT, HAMISH STEELE;

6 WEBER, BIRGIT; BLANCH, LIANNE; ANSON, DONALD STEWART
9 (ii) TITLE OF INVENTION: SYNTHETIC MAMMALIAN

ALPHA-N-ACETYLGLUCOSAMINIDASE AND GENETIC SEQUENCES ENCODING

12 (iii) NUMBER OF SEQUENCES: 6

14 (iv) CORRESPONDENCE ADDRESS:

15 (A) ADDRESSEE: NIXON PEABODY LLP 

16 (B) STREET: 990 STEWART AVENUE 

17 (C) CITY: GARDEN CITY 

18 (D) STATE: NEW YORK 

19 (E) COUNTRY: UNITED STATES

20 (F) ZIP: 11530 

22 (v) COMPUTER READABLE FORM: 

23 (A) MEDIUM TYPE: Floppy disk 

24 (B) COMPUTER: IBM PC compatible 

25 (C) OPERATING SYSTEM: PC-DOS/MS-DOS

26 (D) SOFTWARE: PatentIn Release #1.0, Version #1.25
28 (vi) CURRENT APPLICATION DATA:

C--> 29 (A) APPLICATION NUMBER: US/09/836,613

C--> 30 (B) FILING DATE: 17-Apr-2001

32 (vii) PRIOR APPLICATION DATA:

33 (A) APPLICATION NUMBER: PCT/US96/00747

34 (B) FILING DATE: 22-NOV-1996 

36 (viii) ATTORNEY/AGENT INFORMATION: 

37 (A) NAME: POKALSKY, ANN R. 

38 (B) REGISTRATION NUMBER: 34,697 

39 (C) REFERENCE/DOCKET NUMBER: 2249/104 

41 (ix) TELECOMMUNICATION INFORMATION: 

42 (A) TELEPHONE: 516 742 4343 

43 (B) TELEFAX: 516 742 4366 
47 (2) INFORMATION FOR SEQ ID NO: 1:

49 (i) SEQUENCE CHARACTERISTICS: 

50 (A) LENGTH: 2575 base pairs 

51 (B) TYPE: nucleic acid 

52 (C) STRANDEDNESS: single

53 (D) TOPOLOGY: linear 
55 (ii) MOLECULE TYPE: cDNA 

57 (vi) ORIGINAL SOURCE: 

58 (A) ORGANISM: Homo sapiens 

59 (F) TISSUE TYPE: Peripheral Blood 

60 (G) CELL TYPE: Leukocyte 

62 (ix) FEATURE: 

63 (A) NAME/KEY: CDS

64 (B) LOCATION: 102..2330

66 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

67 CCCGGGCTTA GCCTTCGGGT CCACGTGGCC GGAGGCCGGC AGCTGATTGG ACGCGGGCCG 60 

69 CCCCACCCCC TGGCCGTCGC GGGACCCGCA GGACTGAGAC C ATG GAG GCG GTG 113 

70 Met Glu Ala Val 

71 1 

73 GCG GTG GCC GCG GCG GTG GGG GTC CTT CTC CTG GCC GGG GCC GGG GGC 161 

74 Ala Val Ala Ala Ala Val Gly Val Leu Leu Leu Ala Gly Ala Gly Gly 

75 5 10 15 20 

77 GCG GCA GGC GAC GAG GCC CGG GAG GCG GCG GCC GTG CGG GCG CTC GTG 209 

78 Ala Ala Gly Asp Glu Ala Arg Glu Ala Ala Ala Val Arg Ala Leu Val 

79 25 30 35

81 GCC CGG CTG CTG GGG CCA GGC CCC GCG GCC GAC TTC TCC GTG TCG GTG 257 

82 Ala Arg Leu Leu Gly Pro Gly Pro Ala Ala Asp Phe Ser Val Ser Val 

83 40 45 50 

85 GAG CGC GCT CTG GCT GCC AAG CCG GGC TTG GAC ACC TAC AGC CTG GGC 305 

86 Glu Arg Ala Leu Ala Ala Lys Pro Gly Leu Asp Thr Tyr Ser Leu Gly 

87 55 60 65

89 GGC GGC GGC GCG GCG CGC GTG CGG GTG CGC GGC TCC ACG GGC GTG GCG 353 

90 Gly Gly Gly Ala Ala Arg Val Arg Val Arg Gly Ser Thr Gly Val Ala 

91 70 75 80 

93 GCC GCC GCG GGG CTG CAC CGC TAC CTG CGC GAC TTC TGT GGC TGC CAC 401 

94 Ala Ala Ala Gly Leu His Arg Tyr Leu Arg Asp Phe Cys Gly Cys His 

95 85 90 95 100

97 GTG GCC TGG TCC GGC TCT CAG CTG CGC CTG CCG CGG CCA CTG CCA GCC 449

98 Val Ala Trp Ser Gly Ser Gln Leu Arg Leu Pro Arg Pro Leu Pro Ala

99 105 110 115 

101 GTG CCG GGG GAG CTG ACC GAG GCC ACG CCC AAC AGG TAC CGC TAT TAC 497

102 Val Pro Gly Glu Leu Thr Glu Ala Thr Pro Asn Arg Tyr Arg Tyr Tyr

103 120 125 130

105 CAG AAT GTG TGC ACG CAA AGC TAC TCC TTC GTG TGG TGG GAC TGG GCC 545

106 Gln Asn Val Cys Thr Gln Ser Tyr Ser Phe Val Trp Trp Asp Trp Ala

107 135 140 145 

109 CGC TGG GAG CGA GAG ATA GAC TGG ATG GCG CTG AAT GGC ATC AAC CTG 593 

110 Arg Trp Glu Arg Glu Ile Asp Trp Met Ala Leu Asn Gly Ile Asn Leu

111 150 155 160

113 GCA CTG GCC TGG AGC GGC CAG GAG GCC ATC TGG CAG CGG GTG TAC CTG 641

114 Ala Leu Ala Trp Ser Gly Gln Glu Ala Ile Trp Gln Arg Val Tyr Leu

115 165 170 175 180 

117 GCC TTG GGC CTG ACC CAG GCA GAG ATC AAT GAG TTC TTT ACT GGT CCT 689 

118 Ala Leu Gly Leu Thr Gln Ala Glu Ile Asn Glu Phe Phe Thr Gly Pro

119 185 190 195

121 GCC TTC CTG GCC TGG GGG CGA ATG GGC AAC CTG CAC ACC TGG GAT GGC 737

122 Ala Phe Leu Ala Trp Gly Arg Met Gly Asn Leu His Thr Trp Asp Gly 

123 200 205 210 

125 CCC CTG CCC CCC TCC TGG CAC ATC AAG CAG CTT TAC CTG CAG CAC CGG 785 

126 Pro Leu Pro Pro Ser Trp His Ile Lys Gln Leu Tyr Leu Gln His Arg

127 215 220 225

129 GTC CTG GAC CAG ATG CGC TCC TTC GGC ATG ACC CCA GTG CTG CCT GCA 833

130 Val Leu Asp Gln Met Arg Ser Phe Gly Met Thr Pro Val Leu Pro Ala

131 230 235 240 

133 TTC GCG GGG CAT GTT CCC GAG GCT GTC ACC AGG GTG TTC CCT CAG GTC 881 

134 Phe Ala Gly His Val Pro Glu Ala Val Thr Arg Val Phe Pro Gln Val

135 245 250 255 260

137 AAT GTC ACG AAG ATG GGC AGT TGG GGC CAC TTT AAC TGT TCC TAC TCC 929

138 Asn Val Thr Lys Met Gly Ser Trp Gly His Phe Asn Cys Ser Tyr Ser 

139 265 270 275 

141 TGC TCC TTC CTT CTG GCT CCG GAA GAC CCC ATA TTC CCC ATC ATC GGG 977 

142 Cys Ser Phe Leu Leu Ala Pro Glu Asp Pro Ile Phe Pro Ile Ile Gly

143 280 285 290

145 AGC CTC TTC CTG CGA GAG CTG ATC AAA GAG TTT GGC ACA GAC CAC ATC 1025

146 Ser Leu Phe Leu Arg Glu Leu Ile Lys Glu Phe Gly Thr Asp His Ile

147 295 300 305 

149 TAT GGG GCC GAC ACT TTC AAT GAG ATG CAG CCA CCT TCC TCA GAG CCC 1073 

150 Tyr Gly Ala Asp Thr Phe Asn Glu Met Gln Pro Pro Ser Ser Glu Pro

151 310 315 320 

153 TCC TAC CTT GCC GCA GCC ACC ACT GCC GTC TAT GAG GCC ATG ACT GCA 1121 

154 Ser Tyr Leu Ala Ala Ala Thr Thr Ala Val Tyr Glu Ala Met Thr Ala 

155 325 330 335 340 

157 GTG GAT ACT GAG GCT GTG TGG CTG CTC CAA GGC TGG CTC TTC CAG CAC 1169 

158 Val Asp Thr Glu Ala Val Trp Leu Leu Gln Gly Trp Leu Phe Gln His

159 345 350 355

161 CAG CCG CAG TTC TGG GGG CCC GCC CAG ATC AGG GCT GTG CTG GGA GCT 1217

162 Gln Pro Gln Phe Trp Gly Pro Ala Gln Ile Arg Ala Val Leu Gly Ala

163 360 365 370

165 GTG CCC CGT GGC CGC CTC CTG GTT CTG GAC CTG TTT GCT GAG AGC CAG 1265

166 Val Pro Arg Gly Arg Leu Leu Val Leu Asp Leu Phe Ala Glu Ser Gln

167 375 380 385

169 CCT GTG TAT ACC CGC ACT GCC TCC TTC CAG GGC CAG CCC TTC ATC TGG 1313

170 Pro Val Tyr Thr Arg Thr Ala Ser Phe Gln Gly Gln Pro Phe Ile Trp

171 390 395 400 

173 TGC ATG CTG CAC AAC TTT GGG GGA AAC CAT GGT CTT TTT GGA GCC CTA 1361 

174 Cys Met Leu His Asn Phe Gly Gly Asn His Gly Leu Phe Gly Ala Leu 

175 405 410 415 420 

177 GAG GCT GTG AAC GGA GGC CCA GAA GCT GCC CGC CTC TTC CCC AAC TCC 1409 

178 Glu Ala Val Asn Gly Gly Pro Glu Ala Ala Arg Leu Phe Pro Asn Ser 

179 425 430 435 

181 ACC ATG GTA GGC ACG GGC ATG GCC CCC GAG GGC ATC AGC CAG AAC GAA 1457

182 Thr Met Val Gly Thr Gly Met Ala Pro Glu Gly Ile Ser Gln Asn Glu

183 440 445 450 

185 GTG GTC TAT TCC CTC ATG GCT GAG CTG GGC TGG CGA AAG GAC CCA GTG 1505 

186 Val Val Tyr Ser Leu Met Ala Glu Leu Gly Trp Arg Lys Asp Pro Val 

187 455 460 465 

189 CCA GAT TTG GCA GCC TGG GTG ACC AGC TTT GCC GCC CGG CGG TAT GGG 1553 

190 Pro Asp Leu Ala Ala Trp Val Thr Ser Phe Ala Ala Arg Arg Tyr Gly 

191 470 475 480 

193 GTC TCC CAC CCG GAC GCA GGG GCA GCG TGG AGG CTA CTG CTC CGG AGT 1601 

194 Val Ser His Pro Asp Ala Gly Ala Ala Trp Arg Leu Leu Leu Arg Ser 

195 485 490 495 500 

198 GTG TAC AAC TGC TCC GGG GAG GCC TGC AGG GGC CAC AAT CGT AGC CCG 1649



199 Val Tyr Asn Cys Ser Gly Glu Ala Cys Arg Gly His Asn Arg Ser Pro

505 510 515

[illegible] CCG TCC CTA CAG ATG AAT ACC AGC ATC TGG TAC AAC

Leu Val Arg Arg Pro Ser Leu Gln Met Asn Thr Ser Ile Trp Tyr Asn

[illegible]

[illegible] CTG CTG CCG GCA CTG GAC GAG GTG CTG GCT AGT 1937

223 Val Leu Ala Tyr Glu Leu Leu Pro Ala Leu Asp Glu Val Leu Ala Ser

[illegible]

[illegible] Glu Gln Asn Ser Arg Tyr

232 630 635 640

[illegible] TGG GGG CCA GAA GGC AAC [illegible] CTG GAC TAT GCC AAC 2081

235 Gln Leu Thr Leu Trp Gly Pro Glu Gly Asn Ile Leu Asp Tyr Ala Asn

236 645 650

[illegible]

[illegible] GGCCTTGTTT TCCGCTAATT 2370

[illegible]

266 CCTCCACCAC [illegible] GTGGGATTAA [illegible]
268 AAAAAAGTCG AGCGGCCGCG AATTC 2575
272 (2) INFORMATION FOR SEQ ID NO: 2:

274 (i) SEQUENCE CHARACTERISTICS:

275 (A) LENGTH: 743 amino acids

276 (B) TYPE: amino acid
277 (D) TOPOLOGY: linear

279 (ii) MOLECULE TYPE: protein
281 (ix) FEATURE:

282 (A) NAME/KEY: Potentially-glycosylated Asn site

283 (B) LOCATION: 261
285 (ix) FEATURE:

286 (A) NAME/KEY: Potentially-glycosylated Asn site

287 (B) LOCATION:
289 (ix) FEATURE:

290 (A) NAME/KEY: Potentially-glycosylated Asn site

291 (B) LOCATION:
293 (ix) FEATURE:

294 (A) NAME/KEY: Potentially-glycosylated Asn site

295 (B) LOCATION:
297 (ix) FEATURE:

298 (A) NAME/KEY: Potentially-glycosylated Asn site

299 (B) LOCATION:
301 (ix) FEATURE:

302 (A) NAME/KEY: Potentially-glycosylated Asn site

303 (B) LOCATION: 526
305 (ix) FEATURE:

306 (A) NAME/KEY: Potentially-glycosylated Asn site

307 (B) LOCATION:

309 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:
312 Met Glu Ala Val Ala Val Ala Ala Ala Val Gly Val Leu Leu Leu Ala

315 Gly Ala Gly Gly Ala Ala Gly Asp Glu Ala Arg Glu Ala Ala Ala Val

318 Arg Ala Leu Val Ala Arg Leu Leu Gly Pro Gly Pro Ala Ala Asp Phe

322 Ser Val Ser Val Glu Arg Ala Leu Ala Ala Lys Pro Gly Leu Asp Thr

324 Tyr Ser Leu Gly Gly Gly Gly Ala Ala Arg Val Arg Val Arg Gly Ser

327 Thr Gly Val Ala Ala Ala Ala Gly Leu His Arg Tyr Leu Arg Asp Phe

330 Cys Gly Cys His Val Ala Trp Ser Gly Ser Gln Leu Arg Leu Pro Arg

333 Pro Leu Pro Ala Val Pro Gly Glu Leu Thr Glu Ala Thr Pro Asn Arg

336 Tyr Arg Tyr Tyr Gln Asn Val Cys Thr Gln Ser Tyr Ser Phe Val Trp

339 Trp Asp Trp Ala Arg Trp Glu Arg Glu Ile Asp Trp Met Ala Leu Asn

150 155 160



PATENT APPLICATION: US/09/836,613



DATE: 05/02/2001 
TIME: 12:01:57 



Input Set : A:\es.txt 

Output Set: N:\CRF3\05022001\I836613.raw 
L:939 M:341 W: (46) "n" or "Xaa" used, for SEQ ID#:6